Re-embedding of watermarks in multimedia signals

ABSTRACT

Methods and apparatus for processing a multimedia signal comprising a watermark signal are described. The method includes the steps of: removing at least a portion of the watermark signal, and adding a new watermark signal to the multimedia signal so as to form a new watermarked multimedia signal.

The present invention relates to apparatus and methods for re-embedding information in multimedia signals, such as audio, video or data signals.

Watermarking of multimedia signals is a technique for the transmission of additional data along with the multimedia signal. For instance, watermarking techniques can be used to embed copyright and copy control information into audio signals.

The main requirement of a watermarking scheme is that it is not observable (i.e. in the case of an audio signal, it is inaudible) whilst being robust to attacks to remove the watermark from the signal (e.g. removing the watermark will damage the signal). It will be appreciated that the robustness of a watermark will normally be a trade off against the quality of the signal in which the watermark is embedded. For instance, if a watermark is strongly embedded into an audio signal (and is thus difficult to remove) then it is likely that the quality of the audio signal will be reduced, if one tries to remove it without the knowledge of the underlying technique and the secret key.

The altering of a watermark by increasing the amount of embedded information in a watermark signal is known. In such instances, an extra watermark sequence is added to an existing watermarked signal. This is, for instance, implied in the 4C 12 bit watermark specification. A copy of this specification can be found at http://www.4centity.com/data/tech/4cspec.pdf.

It will be appreciated that when the watermark information needs to be changed repeatedly, such an approach not only degrades the quality of the original information signal (as additional watermark signals are added to change the payload) but also collisions between the individual embedded watermarks significantly degrade the watermark robustness. For example, certain applications of copyright require that the copyright information embedded in a signal is changed repeatedly in order to assert proper copy control.

It is an object of the present invention to provide a watermarking scheme that substantially addresses at least one of the problems of the prior art, whether referred to herein or otherwise.

In a first aspect, the present invention provides a method of processing a multimedia signal comprising a watermark signal, the method comprising the steps of: removing at least a portion of an original watermark signal; and adding a new watermark signal to the multimedia signal so as to form a new watermarked multimedia signal.

Preferably, said original watermark signal is removed by applying a negative version of said original watermark signal to the multimedia signal.

Preferably, the method further comprises the step of determining the value of at least one of the parameters used to embed the original watermark in the multimedia signal.

Preferably, said parameter is utilized to remove at least a portion of said original watermark signal.

Preferably, said new signal is embedded in the multimedia signal using said embedding parameters having said determined value.

Suitably, said new signal is embedded in the multimedia signal using said embedding parameters having values other than the said determined values.

Preferably, said parameter comprises at least one of: embedding strength, synchronization information, time offset, time-scaling, an amount of a circular shift of a sequence, and a watermark symbol period.

Suitably, all of said original watermark is removed, and the new signal comprises a new watermark signal.

Preferably, said watermark comprises at least two sequences of values, at least one sequence of values being removed as said portion of the watermark signal, so as to leave at least one remaining sequence of values from the original watermark signal; and wherein said new signal comprises at least one further sequence of values which together with said remaining sequence forms a new watermark signal.

Preferably, all of said sequences of values are formed from a single sequence of values which has been circularly shifted by different amounts.

Preferably, said removed portion of the watermark signal was embedded with a predetermined strength into said multimedia signal, the method comprising the step of embedding the new signal into the multimedia signal with preferably the same predetermined strength.

Preferably, the embedding strength is such that the degradation in the quality of the new watermarked multimedia signal is perceptible but not annoying.

Preferably, at least one of the original watermark signal and the new watermark signal comprises a smoothly varying signal formed by applying a window shaping function to a sequence of values, the integral over the window shaping function being zero.

Preferably, the window shaping function has a bi-phase behavior.

Preferably, the bi-phase window comprises at least two Hanning windows of opposite polarities.

In another aspect, the present invention provides a computer program arranged to perform any of the methods described above.

In a further aspect, the present invention provides a record carrier comprising a computer program described above.

In another aspect, the present invention provides a method of making available for downloading a computer program as described above.

In a further aspect, the present invention provides an apparatus for processing a multimedia signal comprising a watermark signal, the apparatus comprising: a deletion unit arranged to remove at least a portion of the watermark signal; and an embedder arranged to add a new signal to the multimedia signal so as to form a new watermark signal.

Preferably, the apparatus further comprises a detector arranged to detect at least one value of a parameter of said watermark signal.

Preferably, the apparatus comprises a receiver of a multimedia signal comprising an apparatus.

For a better understanding of the invention, and to show how embodiments of the same may be carried into effect, reference will now be made, by way of example, to the accompanying diagrammatic drawings in which:

FIG. 1 is a diagram illustrating a generalized watermark re-embedding apparatus according to a first embodiment of the present invention;

FIG. 2 shows a schematic diagram of one type of watermark embedder;

FIG. 3 is a schematic diagram showing the details of the payload adding and conditioning circuit (unit 630 a of FIG. 2) used for embedding a primary watermark payload;

FIG. 4 is a schematic diagram illustrating a watermark detector that can be used in one preferred embodiment of the invention;

FIG. 5 is a schematic diagram showing the details of watermark generator unit 650 according to a preferred embodiment;

FIG. 6 is a diagram illustrating a watermark embedder apparatus according to a preferred embodiment;

FIG. 7 shows a signal portion extraction filter H;

FIG. 8 shows the payload adding/changing and watermark conditioning stage used for re-embedding a watermark;

FIG. 9 is a diagram illustrating the details of the watermark conditioning apparatus H_(c) of FIG. 8, including charts of the associated signals at each stage;

FIG. 10 shows a typical correlation function corresponding to the original payload of a watermark signal received by the apparatus of FIG. 1 and FIG. 4;

FIG. 11 shows a correlation function after removal of one of the two sequences forming the original watermark; and

FIG. 12 shows a correlation function of a new watermark having a new payload, formed by adding a second watermark sequence to the signal having the correlation function shown in FIG. 10.

FIG. 1 illustrates a re-embedding apparatus 600 according to a first embodiment of the present invention. The apparatus includes an input 602 arranged to receive a watermarked information signal y′_(old). Two copies of the input signal y′_(old) are formed, with one copy going to a delay unit 614, and the other being input to a detector 640.

The detector 640 is arranged to detect an estimate w′_(old) (output 612) of the watermark w_(old) that is embedded within the received signal y′_(old), and to estimate the control parameters (output 609) needed for changing the watermark payload.

Information relating to the detected watermark w′_(old) is then passed to the watermark generator 650 and the watermark embedder 620. The information passed to the watermark generator can be a complete copy of the extracted watermark w′_(old), or alternatively sufficient information (such as, for watermarks comprising two circularly shifted sequences of a single series of values, the circular shift d_(old)) to allow the watermark generator to generate a copy w_(old) of w′_(old). The output 608 of the watermark generator, also denoted by w_(old), is preferably an error corrected version of the extracted watermark w′_(old).

The watermark generator 650 additionally generates a new watermark signal w_(new) (output 618), for adding to the original information signal y′_(old).

The delay unit 614 acts to delay the input signal y′_(old) whilst the intervening operations are carried out by the detector 640, and the watermark generator 650. The delayed signal y′_(old) is then passed (output 607) to the embedder 620.

In the watermark embedding unit 620, the old watermark signal w_(old) is removed from the signal y′_(old), and the new watermark w_(new) added to y′_(old) so as to form an information signal y_(new) containing the new watermark w_(new).

The above embodiment represents a particular generalized implementation of the present invention. Below is described a particular watermark re-embedding scheme in accordance with a further embodiment of the present invention for use in conjunction with a particular watermarking scheme.

Such a watermark can for example be embedded using the apparatus shown in FIGS. 2 and 3.

FIG. 2 shows an embedding apparatus 100 arranged to receive a host multimedia signal x at input 120, and output a watermarked version of the received signal (y_(old)) at output 132.

The apparatus 100 receives two sequences of values (w_(old)[k] and w_(ref)[k]) at inputs 110 and 112. The combination of these two sequences of values is used to produce the watermark signal w_(o). Each of the two sequences of values is a circularly shifted version of an original sequence of values. w_(old) is the original sequence w_(s) circularly shifted by d_(old), and w_(ref) is the original sequence w_(s) circularly shifted by d_(ref), i.e. w_(old) is hence a circularly shifted version of w_(ref). The original sequence of values can, for instance, be generated using a Random Number Generator (RNG) using a predetermined seed value S.

As shown in FIG. 3, the payload adding and conditioning circuit 630 a receives w_(old)[k] and w_(ref)[k] at respective inputs 110 and 112. Each sequence is multiplied by a respective sign bit r_(old) and r_(ref) (received at inputs 603 and 604) using multipliers 631 (where r_(old) and r_(ref) are respectively +1 or −1, and remain constant for any given payload). The resulting signals are then passed through signal conditioning units 632, which act to convolve each of the values within each sequence with a window shaping function of the period T_(s). This produces smoothly varying output sequences w_(old)[n] and w_(ref)[n]. Operation of a signal conditioning unit 632 is described later in more detail with reference to FIG. 9. Signal w_(ref)[n] is then delayed by delay unit 634 by a predetermined delay T_(r) (where T_(r) is less than T_(s), and preferably, T_(r)=T_(s)/4). Adder 635 then adds w_(old)[n] and the delayed version of w_(ref)[n], the result being the watermark signal w_(o)[n] which is output from the circuit 630 at output 623.

Referring back to FIG. 2, the received host signal x is split into two copies, a first copy going to adder 130, and a second copy going to multiplication unit 124. The copy sent to multiplication unit 124 may be filtered using a band pass filter (not shown) so as to form the signal x_(b) at input 122 to the multiplier 124.

w_(o)[n] is multiplied with the possibly filtered version x_(b)[n] of the host signal x[n], scaled by gain factor α and added back to the host signal x[n] to generate the watermarked signal y_(old)[n] given by
$$y_{old}[n] = x[n] + \alpha\, w_{o}[n]\, x_{b}[n]. \qquad (1)$$
The gain factor α is set by the variable gain device 128, and in the example shown is set in dependence upon the signal from the signal analyzer 126 which samples the host signal x using a psychoacoustic model. Such a model is, for instance, described in the paper by E. Zwicker, “Audio Engineering and Psychoacoustics: Matching signals to the final receiver, the Human Auditory System”, Journal of the Audio Engineering Society, Vol. 39, pp. 115-126, March 1991. The gain factor α is normally chosen so as to minimize the impact of the watermark signal on the host signal quality.

Such a watermarking scheme is characterized in that, during detection the watermarked signal y_(old) generates two correlation peaks that are separated by pL_(old) (see FIG. 10), where the value pL_(old) is at least part of the watermark payload, and may be defined as
$$pL_{old} = \left| d_{ref} - d_{old} \right| \bmod \left\lceil L_{w}/2 \right\rceil \qquad (2)$$
where d_(ref) and d_(old) ∈ [0, L_(w)−1] are the relative positions of the correlation peaks as seen in FIG. 10, and L_(w) is the length (number of symbols or values) of each of the two watermark sequences.

In addition to pL_(old), extra information is also encoded by changing the relative signs of the embedded watermarks. In the detector, this is seen as a relative sign r_(sign) between the correlation peaks. It will be seen that r_(sign) can take four possible values, and may be defined as:
$$r_{sign} = \frac{2 \cdot \mu_{ref} + \mu_{old} + 3}{2} \in \{0, 1, 2, 3\}, \qquad (3)$$
where μ_(old) and μ_(ref) are the signs of the correlation peaks, and correspond to the sign bits r_(old) and r_(ref) in FIG. 3, respectively. The overall watermark payload pL_(w) is then given as a combination of r_(sign) and pL_(old):
$$pL_{w} = \langle r_{sign},\, pL_{old} \rangle. \qquad (4)$$
The maximum information (I_(max)), in number of bits, that can be carried by a watermark sequence of length L_(w) is thus given by:
$$I_{\max} = \log_{2}\left( 4 \cdot \left\lceil L_{w}/2 \right\rceil \right) \text{ bits} \qquad (5)$$
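As an illustration of equations (2) to (5), the following minimal Python sketch computes the payload components from a pair of peak positions and signs. The function names and the example sequence length of 1023 are assumptions chosen for illustration and do not appear in the specification.

```python
import math


def peak_distance(d_ref: int, d_old: int, l_w: int) -> int:
    """Equation (2): peak separation reduced modulo ceil(L_w / 2)."""
    return abs(d_ref - d_old) % math.ceil(l_w / 2)


def relative_sign(mu_ref: int, mu_old: int) -> int:
    """Equation (3): map two peak signs (+1 or -1) onto {0, 1, 2, 3}."""
    if mu_ref not in (-1, 1) or mu_old not in (-1, 1):
        raise ValueError("peak signs must be +1 or -1")
    # The numerator is always even, so integer division is exact.
    return (2 * mu_ref + mu_old + 3) // 2


def payload(d_ref: int, d_old: int, mu_ref: int, mu_old: int, l_w: int):
    """Equation (4): the payload as the pair <r_sign, pL_old>."""
    return relative_sign(mu_ref, mu_old), peak_distance(d_ref, d_old, l_w)


def max_information_bits(l_w: int) -> float:
    """Equation (5): maximum number of bits for sequence length L_w."""
    return math.log2(4 * math.ceil(l_w / 2))


# Hypothetical example: L_w = 1023, peaks at 10 and 110, both positive.
example = payload(d_ref=10, d_old=110, mu_ref=1, mu_old=1, l_w=1023)
```

With these assumed values the sketch yields a relative sign of 3 and a peak distance of 100, and a sequence of 1023 symbols carries at most 11 bits.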

Below is described a watermark re-embedding apparatus suitable for use in conjunction with the watermark embedder shown in FIG. 2 and the related watermarking scheme. In this particular embodiment, the embedding apparatus is arranged to only partially remove the watermark, and replaces it with new information so as to change the watermark payload. FIG. 1 illustrates the different functional blocks of the re-embedder.

FIG. 4 shows a block diagram of a watermark detector 640 used to extract the watermark embedding parameters that are needed for the re-embedding process. The detector consists of four major stages: (a) the watermark symbol extraction stage (200), (b) the buffering and interpolation stage (300), (c) the correlation and decision stage (400) and (d) the control signal generating stage (500).

In the symbol extraction stage (200), the received watermarked signal y′_(old)[n] is processed to generate multiple (N_(b)) estimates of the watermark sequence, which are multiplexed into the signal w_(e)[m]. These estimates of the watermark sequence are required to resolve (compensate for) any time offset that may exist between the embedder and the detector, so that the watermark detector can synchronize to the watermark sequence inserted in the host signal.

In the buffering and interpolation stage (300), these estimates are de-multiplexed into N_(b) separate buffers. An interpolation is subsequently applied to each buffer to resolve (compensate for) possible timescale modifications that may have occurred. For instance, a drift in sampling (clock) frequency may result in a stretch or shrink in the time domain signal (i.e. the watermark may have been stretched or shrunk).

In the correlation and decision stage (400), the content of each buffer is correlated with the reference watermark and the maximum correlation peaks are compared against a threshold to determine the likelihood of whether the watermark is indeed embedded within the received signal y′_(old)[n].

In the control signal generating unit (500), the detection truth value, the corresponding watermark sequence, its buffer index, and the value of the new payload pL_(new) (input 616) are combined to generate the parameters that are needed to change the watermark payload. The outputs (609, 612) of the control signal generator are passed on to the watermark generator 650 and the watermark embedding unit 620.

Details of the watermark generating unit 650 are shown in FIG. 5. In this unit, w_(new) and w_(old) are generated using parameter information obtained from the detector 640 (inputs 609 and 612).

In one preferred embodiment, the watermark sequences w_(old) and w_(new) can be generated as follows. Firstly a finite length, preferably zero mean and uniformly distributed random sequence w_(s) is generated using a random number generator 651 with an initial seed S. It will be appreciated that this initial seed S is preferably the same as that used in generating w_(old) during the first embedding. This results in the sequence of length L_(w)
$$w_{s}[k] \in [-1, 1], \quad \text{for } k = 0, 1, 2, \ldots, L_{w} - 1 \qquad (6)$$

Then the sequence w_(s) is circularly shifted by the amounts d_(old) and d_(new) using the circularly shifting units 653 a and 653 b to obtain the random sequences w_(old) and w_(new), respectively. It will be appreciated that these two sequences (w_(old) and w_(new)) are effectively a first sequence and a second sequence, with the second sequence being circularly shifted with respect to the first. These two sequences are then passed on to the watermark embedding unit 620 (outputs 608, 618).
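The generation of w_(old) and w_(new) from a common seeded sequence can be sketched as follows. This is a minimal illustration only: the use of Python's `random` module, the shift direction, and the particular seed, length and shift values are all assumptions, since the specification does not fix a generator or a shift convention. A finite uniform draw is also zero mean only in expectation.

```python
import random


def generate_base_sequence(seed: int, length: int) -> list:
    """Seeded sequence w_s[k] drawn uniformly from [-1, 1], as in equation (6)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


def circular_shift(seq: list, d: int) -> list:
    """Return seq circularly shifted by d, so that result[k] == seq[(k - d) % L].

    The direction of the shift is an assumed convention.
    """
    d %= len(seq)
    return seq[-d:] + seq[:-d] if d else list(seq)


# Hypothetical parameters: seed S, length L_w and the two shift amounts.
S, L_W, D_OLD, D_NEW = 1234, 8, 3, 5
w_s = generate_base_sequence(S, L_W)
w_old = circular_shift(w_s, D_OLD)
w_new = circular_shift(w_s, D_NEW)
```

Because the same seed reproduces the same base sequence, a re-embedder that knows S and d_(old) can regenerate w_(old) without receiving it in full.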

Details of a watermark embedding unit 620 as part of the watermark re-embedding apparatus 600, and suitable for use with the considered particular watermarking scheme, are shown in FIG. 6.

The host signal y′_(old) (containing at least an initial watermark sequence w_(old)) is provided at input 607 of the apparatus. The host signal y′_(old) is passed in the direction of output 610 via the delay unit 629 and the adder 626. However, a replica y_(b) of the host signal y′_(old) (input 628) is split off in the direction of the multiplier 624, for carrying the new watermark information.

The multiplier 624 is utilized to calculate the product of the watermark altering signal w_(diff) and the replica signal y_(b). The watermark altering signal w_(diff) is obtained from the payload changing and watermark conditioning apparatus 630 b, and derived from the watermark random sequences w_(old) and w_(new) (inputs 608 and 618 respectively), which are input to the payload changing and watermark conditioning apparatus.

The resulting product, w_(diff)y_(b), is then passed via a gain controller 625 to the adder 626. The gain factor α applied by the controller 625 controls the trade off between the audibility and the robustness of the watermark. It may be a constant, or variable in at least one of time, frequency and space. The apparatus in FIG. 6 shows that, when α is variable, it can be automatically adapted via a control signal 609 obtained from the watermark detector unit 640.

In another preferred embodiment, the gain factor α can be independently controlled via a signal analyzer based upon the properties of the host signal y_(old). In the latter case, the gain α is automatically adapted, preferably so as to minimize the impact on the signal quality, according to a properly chosen perceptibility cost-function, such as a psychoacoustic model of the human auditory system (HAS). The same model may be used as that utilized to control the adaptive gain factor used to control the embedding strength of the original watermark signal (see FIG. 2 and associated text).

In FIG. 6, the resulting watermarked audio signal y_(new) is then obtained at the output 610 of the embedding apparatus 620 by adding an appropriately scaled version of the product of w_(diff) and y_(b) to the host signal:
$$y_{new}[n] = y_{old}[n] + \alpha\, w_{diff}[n]\, y_{b}[n] \approx x[n] + \alpha\, w_{c}[n]\, x_{b}[n], \qquad (7)$$
where w_(c) is derived from w_(ref) and w_(new) in the same way as w_(o) was derived from w_(ref) and w_(old) in FIG. 3.
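The approximation in equation (7) can be checked numerically with a short Python sketch. It assumes, purely for illustration, that no band-pass filter is used (so x_(b) = x and y_(b) = y_(old)), that α is constant, and that the watermark values lie in [−1, 1]; the sample values and function names are hypothetical. Under these assumptions the re-embedded signal differs from a direct embedding of the new watermark only by a term of order α².

```python
def embed(host, watermark, alpha):
    """Equation (1) with x_b = x: y[n] = x[n] + alpha * w[n] * x[n]."""
    return [x + alpha * w * x for x, w in zip(host, watermark)]


def re_embed(y_old, w_old, w_new, alpha):
    """Equation (7) with y_b = y_old: add alpha * (w_new - w_old) * y_b."""
    return [y + alpha * (wn - wo) * y for y, wo, wn in zip(y_old, w_old, w_new)]


# Hypothetical host samples and already-conditioned watermark values.
ALPHA = 0.01
host = [1.0, -0.5, 0.25, 2.0]
w_o = [1.0, -1.0, 0.5, -0.5]    # combined old watermark
w_c = [-1.0, 1.0, 0.25, 0.5]    # combined new watermark

y_old = embed(host, w_o, ALPHA)
y_new = re_embed(y_old, w_o, w_c, ALPHA)
y_direct = embed(host, w_c, ALPHA)

# Residual per sample: alpha^2 * w_o * (w_c - w_o) * x, bounded by 2 * alpha^2 * |x|.
residual = [a - b for a, b in zip(y_new, y_direct)]
```

Since the residual scales with α², a small gain factor keeps the re-embedded signal close to what a first-time embedding of the new payload would have produced.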

Preferably, the parameters of the watermark w_(diff)[n] are chosen such that when multiplied with y_(b), it predominantly modifies the short time envelope of y_(b).

FIG. 7 shows one preferred embodiment in which the input 628 to the multiplier 624 in FIG. 6 is obtained by filtering the host signal y′_(old) using a filter H in the filtering unit 615. Preferably, the filter H is a linear phase band-pass filter characterized by its lower cut off frequency f_(L) and upper cut off frequency f_(H). Preferably, the filter has the same properties as the filter utilized to extract x_(b) from x in the embedder 100.

In FIG. 8, the details of the payload changing and watermark-conditioning unit 630 b are shown. In this particular unit, the watermark signals w_(old) and w_(new) are combined to generate the multi-bit watermark altering signal w_(diff). The watermark altering signal w_(diff), when combined with y′_(old), generates a watermarked signal with a payload corresponding to w_(new).

The sequences w_(old) and w_(new) are first multiplied with a respective sign bit r_(old) and r_(new) in the multiplying units 654 a and 654 b. The respective values of r_(old) and r_(new) are derived from the detection unit 640 and are passed on via the control input 609. The values of r_(old) and r_(new) remain constant (typically at either +1 or −1), and only change when the payload of the watermark is changed.

The difference w_(diff)[k] between the signed sequences w_(old) and w_(new) is calculated using adder 635 to add the negative of the old watermark sequence w_(old) to the positive of the new sequence w_(new). The result w_(diff) is then passed through the conditioning stage to generate the slowly varying multi-bit watermark w_(diff)[n].

FIG. 9 shows details of the watermark conditioning apparatus 632 used in the payload adding/changing and watermark conditioning apparatus 630. In the case of re-embedding, the watermark random sequence w_(diff)=w_(new)−w_(old) is input to the conditioning apparatus 632.

In the conditioning circuit, the watermark signal sequence w_(diff)[k] is first applied to the input of an up-sampler 180. Chart 181 illustrates one of the possible sequences w_(diff) as a sequence of values of random numbers between +1 and −1, with the sequence being of length L_(w). The up-sampler adds (T_(s)−1) zeros between each sample so as to raise the sampling frequency by the factor T_(s). T_(s) is referred to as the watermark symbol period and represents the span of the watermark symbol in the audio signal. In the case of the received signal y′_(old) having undergone a time scaling compared with the transmitted signal y_(old), T_(s) is replaced with an appropriately scaled sampling factor T_(new) that takes into account the scaling effect. Chart 183 shows the results of the signal illustrated in chart 181 once it has passed through the up-sampler 180.

A window shaping function s[n], such as a bi-phase window, is then convolved with the up-sampled signal w_(i)[n] so as to convert it into a slowly varying narrow-band signal w_(diff)[n], whose behavior for the w_(diff)[k] sequence of chart 181 is as shown in chart 185.

Chart 184 shows a typical bi-phase window shaping function. The window shaping function has support in the interval 0 to T_(s) only. The window function is applied to the watermark sequence in order to produce a smoothly varying signal, so as to minimize the decrease in the quality of the host signal.
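The conditioning chain of FIG. 9 (up-sampling by T_(s) followed by convolution with a bi-phase window) can be sketched in Python as below. The exact window of chart 184 is not reproduced here: the sketch assumes a window built from two Hann (Hanning) halves of opposite polarity, positive half first, using the periodic Hann formula, and an even T_(s). It also appends the zeros after every symbol, including the last, so that the output has exactly L_(w)·T_(s) samples. All function names are hypothetical.

```python
import math


def biphase_window(t_s: int) -> list:
    """Assumed bi-phase window: a Hann half followed by its negative."""
    if t_s < 2 or t_s % 2:
        raise ValueError("t_s must be an even number of samples")
    half = t_s // 2
    hann = [0.5 * (1.0 - math.cos(2.0 * math.pi * n / half)) for n in range(half)]
    return hann + [-h for h in hann]


def upsample(symbols: list, t_s: int) -> list:
    """Follow each symbol with (t_s - 1) zeros, raising the rate by t_s."""
    out = []
    for value in symbols:
        out.append(value)
        out.extend([0.0] * (t_s - 1))
    return out


def convolve(signal: list, kernel: list) -> list:
    """Plain full linear convolution."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue  # zero samples contribute nothing
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def condition(symbols: list, t_s: int) -> list:
    """Up-sample, shape with the bi-phase window, trim to len(symbols) * t_s."""
    shaped = convolve(upsample(symbols, t_s), biphase_window(t_s))
    return shaped[: len(symbols) * t_s]
```

Because the window has support of exactly T_(s) samples and the symbols are spaced T_(s) samples apart, each output frame is simply one symbol value multiplied by the window, and the window sums to zero as required of the preferred shaping function.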

Below is described in more detail the operation of the detection apparatus (200, 300, 400, 500) shown in FIG. 4.

In the watermark symbol extraction stage 200 shown in FIG. 4, the incoming watermarked signal y′_(old)[n] is input to the signal conditioning filter H_(b) (210). This filter 210 is typically a band pass filter and has the same behavior as the corresponding filter H (615) shown in FIG. 7. The output of the filter H_(b) is y′_(b)[n], and assuming linearity within the transmission channel, it follows from equation (1) that
$$y'_{b}[n] \approx \left( 1 + \alpha\, w_{o}[n] \right) x_{b}[n] \qquad (8)$$
Note that when no filter is used in the embedder (i.e., when H=1) then H_(b) in the detector can also be omitted, or it can still be included to improve the detection performance. If H_(b) is omitted, then y′_(b) in equation (8) is replaced with y′_(old). The rest of the processing is the same.

For simplification, it is assumed that there is perfect synchronism between the embedder and the detector (i.e. no offset and no change in timescale), and that the audio signal is divided into frames of length T_(s), and that y′_(b,m)[n] is the n-th sample of the m-th frame of the filtered signal y′_(b)[n]. It should be noted that if there is not perfect synchronism between the embedder and the detector, then any deviation can be compensated for within the buffering and interpolation stage 300 utilizing techniques known to the skilled person, e.g. iteratively searching through all possible scale and offset modifications until a best match is achieved.

The energy E[m] corresponding to the y′_(b,m)[n] frame is:
$$E[m] = \sum_{n=0}^{T_{s}-1} \left| y'_{b,m}[n]\, S[n] \right|^{2} \qquad (9)$$
where S[n] is the same window shaping function used in the watermark conditioning circuit of FIG. 9. A person skilled in the art will appreciate that equation 9 represents a matched filter receiver, and is the optimum receiver when the symbol period is perfectly synchronized. Notwithstanding this fact, from now on, we set S[n]=1 in order to simplify subsequent explanations.

Combining this with equation 8, it follows that:
$$E[m] = \sum_{n=0}^{T_{s}-1} \left| y'_{b,m}[n] \right|^{2} \approx \sum_{n=0}^{T_{s}-1} \left| \left( 1 + \alpha\, w_{e}[m] \right) x_{b,m}[n] \right|^{2} \qquad (10)$$
where w_(e)[m] is the m-th extracted watermark symbol, and contains N_(b) time-multiplexed estimates of the embedded watermark sequences. Solving for w_(e)[m] in equation 10 and ignoring higher order terms of α gives the following approximation:
$$w_{e}[m] \approx \frac{1}{2\alpha} \left( \frac{\sum_{n=0}^{T_{s}-1} \left| y'_{b,m}[n] \right|^{2}}{\sum_{n=0}^{T_{s}-1} \left| x_{b,m}[n] \right|^{2}} - 1 \right) \qquad (11)$$

In the watermark extraction stage 200 shown in FIG. 4, the output y′_(b)[n] of the filter H_(b) is provided as an input to a frame divider 220, which divides the audio signal into frames of length T_(s), i.e. into y′_(b,m)[n], with the energy calculating unit 230 then being used to calculate the energy corresponding to each of the framed signals as per equation (9). The output of this energy calculation unit 230 is then provided as an input to the whitening stage H_(w) (240) which performs the function shown in equation 11 so as to provide an output w_(e)[m].

It will be realized that the denominator of equation 11 contains a term that requires knowledge of the host (original) signal x. As the signal x is not available to the detector, it means that in order to calculate w_(e)[m] then the denominator of equation 11 must be estimated.

Below is described how such an estimation can be achieved for a bi-phase window shaping function, but it will be appreciated that the teaching could be extended to other window shaping functions.

It will be seen by examination of the bi-phase window function shown in FIG. 9 (chart 184), that when the envelope of an audio frame is modulated with such a window function, the first and the second halves of the frame are scaled in opposite directions. In the detector, this property is utilized to estimate the envelope energy of the host signal y′_(old).

Consequently, within the detector, the audio frame is first sub-divided into two halves. The energy functions corresponding to the first and second half frames are hence given by
$$E_{1}[m] = \sum_{n=0}^{T_{s}/2-1} \left| y'_{b,m}[n] \right|^{2} \qquad (12)$$
and
$$E_{2}[m] = \sum_{n=T_{s}/2}^{T_{s}-1} \left| y'_{b,m}[n] \right|^{2} \qquad (13)$$
respectively. As the envelope of the original audio is modulated in opposite directions within the two sub-frames, the original audio envelope can be approximated as the mean of E₁[m] and E₂[m].

Further, the instantaneous modulation value can be taken as the difference between these two functions. Thus, for the bi-phase window function, the watermark w_(e)[m] can be approximated by:
$$w_{e}[m] \approx \frac{1}{2\alpha} \left( \frac{E_{1}[m] - E_{2}[m]}{E_{1}[m] + E_{2}[m]} - 1 \right) \qquad (14)$$
This output w_(e)[m] is then passed to the buffering and interpolation stage 300, where the signal is de-multiplexed by a de-multiplexer 310, buffered in buffers 320 of length L_(b) so as to resolve any lack of synchronism between the embedder and the detector, and interpolated within the interpolation unit 330 so as to compensate for any time scale modification between the embedder and the detector. Such compensation can utilize known techniques, and hence is not described in any more detail within this specification.
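The half-frame energies of equations (12) and (13) can be illustrated with the following Python sketch. It computes E₁, E₂ and only the normalized difference (E₁ − E₂)/(E₁ + E₂) that appears inside equation (14), leaving the remaining scaling to the reader. The test frame is an assumption made for illustration: a constant host of value 1 modulated by a square bi-phase pattern (+1 over the first half, −1 over the second) rather than the smooth window of chart 184, with an even frame length.

```python
def half_frame_energies(frame: list):
    """Equations (12) and (13): energies of the first and second half frames."""
    half = len(frame) // 2
    e1 = sum(v * v for v in frame[:half])
    e2 = sum(v * v for v in frame[half:])
    return e1, e2


def normalized_energy_difference(frame: list) -> float:
    """(E1 - E2) / (E1 + E2); returns 0.0 for an all-zero frame."""
    e1, e2 = half_frame_energies(frame)
    total = e1 + e2
    return (e1 - e2) / total if total else 0.0


def square_biphase_frame(symbol: float, alpha: float, t_s: int) -> list:
    """Hypothetical test frame: unit host scaled up, then down, by alpha * symbol."""
    half = t_s // 2
    return [1.0 + alpha * symbol * (1.0 if n < half else -1.0) for n in range(t_s)]


ALPHA = 0.1
positive = normalized_energy_difference(square_biphase_frame(+1.0, ALPHA, 8))
negative = normalized_energy_difference(square_biphase_frame(-1.0, ALPHA, 8))
```

For this assumed frame the normalized difference is close to 2α times the embedded symbol, so its sign recovers the sign of the symbol without any knowledge of the host signal x.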

During detection, in order to maximize the accuracy of the watermark detection, the watermark detection process is typically carried out over a length of received signal y′_(old)[n] that is 3 to 4 times that of the watermark sequence length. Thus each watermark symbol to be detected can be constructed by taking the averages of several symbols. This averaging process is referred to as smoothing, and the number of times the averaging is done is referred to as the smoothing factor s_(f). Thus, the detection window length L_(D) is the length of the audio segment (in number of samples) over which a watermark detection truth-value is reported. Consequently, L_(D)=s_(f)L_(w)T_(s), where T_(s) is the symbol period and L_(w) the number of symbols within the watermark sequence. Typically, the length (L_(b)) of each buffer 320 within the buffering and interpolation stage is L_(b)=s_(f)L_(w).
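The smoothing step and the window-length relation L_(D)=s_(f)L_(w)T_(s) can be sketched as follows. The buffer layout is an assumption: the sketch supposes that a buffer of length L_(b)=s_(f)L_(w) holds s_(f) consecutive repetitions of the watermark sequence, so that symbol k is averaged over positions k, k+L_(w), k+2L_(w) and so on. The specification does not state the layout, and the numeric values below are hypothetical.

```python
def smooth_symbols(buffer: list, l_w: int) -> list:
    """Average the s_f repetitions held in a buffer of length s_f * l_w."""
    if l_w <= 0 or not buffer or len(buffer) % l_w:
        raise ValueError("buffer length must be a positive multiple of l_w")
    s_f = len(buffer) // l_w
    return [sum(buffer[r * l_w + k] for r in range(s_f)) / s_f for k in range(l_w)]


def detection_window_length(s_f: int, l_w: int, t_s: int) -> int:
    """L_D = s_f * L_w * T_s, in audio samples."""
    return s_f * l_w * t_s


# Hypothetical example: three repetitions of a three-symbol sequence.
smoothed = smooth_symbols([1, 2, 3, 3, 4, 5, 5, 6, 7], l_w=3)
window = detection_window_length(s_f=3, l_w=1023, t_s=64)
```

Averaging over more repetitions reduces the influence of the host signal on each symbol estimate, at the cost of a proportionally longer detection window.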

As shown in FIG. 4, outputs (w_(D1), w_(D2), . . . w_(DNb)) from thebuffering stage are passed to the interpolation stage and, afterinterpolation, the outputs (w_(I1), w_(I2), . . . w_(INb)) of thisstage, which correspond to the different estimates of the correctlyre-scaled signal, are passed to the correlation and decision stage. Ifit is believed that no time scaling compensation is required, the values(w_(DI), w_(D2), . . . w_(DNb)) can be passed directly to thecorrelation and decision stage 400 i.e. the interpolation stage 330 canbe omitted from the apparatus.

The correlator 410 calculates the correlation of each estimate w_(Ij),j=1, . . . N_(b) with respect to the reference watermark sequencew_(s)[k]. Each respective correlation output corresponding to eachestimate is then applied to the maximum detection unit 420 whichdetermines which two estimates provided the best fits for the circularlyshifted versions w_(old) and w_(ref) of the reference watermark. Thecorrelation values (the peak amplitudes and positions) for theseestimate sequences are passed to the threshold detector and payloadextractor unit 430.

In another output of the correlation stage 410, the watermark sequencesand the buffer indices corresponding to the two best fits for thecircularly shifted versions w_(old) and w_(ref) of the referencewatermark are passed on to the control signal generating unit 500.

Alternatively, if the interpolation stage is omitted, the correlator 410 calculates the correlation of each estimate w_(Dj), j=1, . . . , N_(b) with the reference watermark sequence w_(s)[k] and the results are passed on for subsequent processing to the units 420 and 430 as outlined in the above paragraph.
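By way of illustration only, the correlation of a single estimate with the reference sequence w_(s), followed by selection of the two strongest peaks, can be sketched as follows. The sketch assumes a direct circular correlation over one sequence length and a random ±1 reference. The function names, shift amounts, noise level and seed are hypothetical, and the handling of several estimates (j=1, . . . N_(b)) is omitted.

```python
import random
from typing import List, Sequence, Tuple


def circular_shift(seq: Sequence[float], d: int) -> List[float]:
    """Delay seq circularly by d positions: out[i] = seq[(i - d) % n]."""
    n = len(seq)
    return [seq[(i - d) % n] for i in range(n)]


def circular_correlation(estimate: Sequence[float],
                         reference: Sequence[float]) -> List[float]:
    """Correlate the estimate with every circular shift of the reference."""
    n = len(reference)
    return [sum(estimate[i] * reference[(i - d) % n] for i in range(n))
            for d in range(n)]


def two_best_peaks(corr: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (delay, value) for the two largest-magnitude correlation values."""
    order = sorted(range(len(corr)), key=lambda d: abs(corr[d]), reverse=True)
    return sorted((d, corr[d]) for d in order[:2])


# Hypothetical test signal: two circularly shifted copies of w_s plus noise.
rng = random.Random(1)
L_w = 512
w_s = [rng.choice((-1.0, 1.0)) for _ in range(L_w)]
w_ref = circular_shift(w_s, 20)
w_old = circular_shift(w_s, 75)
estimate = [a + b + rng.gauss(0.0, 0.5) for a, b in zip(w_ref, w_old)]

peaks = two_best_peaks(circular_correlation(estimate, w_s))
delays = [d for d, _ in peaks]   # positions of the two correlation peaks
```

A practical detector would more likely compute the correlation in the frequency domain; the direct form is used here only to keep the sketch free of external dependencies.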

The threshold detector and payload extractor unit 430 may be utilized to extract the payload (e.g. information content) from the detected watermark signal. Once the unit has estimated the two correlation peaks cL₁ and cL₂ that exceed the detection threshold, the distance pL between the peaks (as defined by equation (2)) is measured. Next, the signs μ₁ and μ₂ of the correlation peaks are determined, and hence r_(sign) calculated from equation (3). The overall watermark payload may then be calculated using equation (4).

For instance, it can be seen in FIG. 10 that pL_(old) is the relative distance between the two peaks. Both peaks are positive, i.e. μ₁=+1 and μ₂=+1. From equation (3), r_(sign)=3. Consequently, the payload pL_(w)=<3, pL_(old)>.
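By way of illustration only, the first steps carried out by a unit such as the threshold detector and payload extractor 430 can be sketched as follows: locating the two peaks whose magnitude exceeds the detection threshold, measuring their relative distance, and reading their signs. The function name, and the choice to measure the distance modulo the sequence length, are assumptions. Equations (2) to (4) are not reproduced in this passage, so the sketch stops short of computing r_(sign) and the overall payload.

```python
from typing import Dict, Optional, Sequence


def peaks_above_threshold(cL: Sequence[float],
                          threshold: float) -> Optional[Dict[str, tuple]]:
    """Locate exactly two peaks with |cL| above the threshold.

    Returns None when the number of such peaks is not two, which is
    treated here as "no valid watermark detected in this window".
    """
    above = [d for d, value in enumerate(cL) if abs(value) > threshold]
    if len(above) != 2:
        return None
    d1, d2 = above
    signs = tuple(1 if cL[d] > 0 else -1 for d in (d1, d2))
    return {
        "delays": (d1, d2),
        # Assumed convention: relative distance taken modulo the length.
        "distance": (d2 - d1) % len(cL),
        "signs": signs,
    }


# Hypothetical confidence levels with one positive and one negative peak.
cL = [0.1] * 10
cL[2] = 9.0
cL[7] = -8.5
result = peaks_above_threshold(cL, threshold=7.0)
# result == {"delays": (2, 7), "distance": 5, "signs": (1, -1)}
```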

The reference watermark sequence w_(s) used within the detector corresponds to (a possibly circularly shifted version of) the original watermark sequence applied to the host signal. For instance, if the watermark signal was calculated using a random number generator with seed S within the embedder, then equally the detector can calculate the same random number sequence using the same random number generation algorithm and the same initial seed so as to determine the watermark signal. Alternatively, the watermark signal originally applied in the embedder and utilized by the detector as a reference could simply be any predetermined sequence.
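By way of illustration only, the regeneration of the reference sequence from a shared seed can be sketched as follows. The function name, the uniform distribution over [−1, 1] and the seed values are assumptions; the passage above requires only that the embedder and the detector use the same generation algorithm and the same initial seed.

```python
import random
from typing import List


def reference_sequence(seed: int, length: int) -> List[float]:
    """Generate a reproducible pseudo-random sequence from a seed.

    The distribution (uniform on [-1, 1]) is an illustrative assumption.
    """
    rng = random.Random(seed)   # private generator, no global state
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]


# The embedder and the detector derive the same sequence independently.
w_s_embedder = reference_sequence(seed=1234, length=16)
w_s_detector = reference_sequence(seed=1234, length=16)
same = w_s_embedder == w_s_detector   # True: same algorithm, same seed
```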

FIG. 10 shows a typical shape of a correlation function as output from the correlator 410. The horizontal scale shows the correlation delay (in terms of the sequence bins). The vertical scale on the left-hand side (referred to as the confidence level cL) represents the value of the correlation peak normalized with respect to the standard deviation of the (typically normally distributed) correlation function.
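By way of illustration only, the normalization that yields the confidence level cL can be sketched as follows. The function name is hypothetical. As a simplification, the standard deviation is computed over all correlation values, including any peaks, which slightly inflates the estimate when a watermark is present.

```python
import statistics
from typing import List, Sequence


def confidence_levels(corr: Sequence[float]) -> List[float]:
    """Normalize correlation values by their standard deviation."""
    sigma = statistics.pstdev(corr)   # population standard deviation
    if sigma == 0:
        raise ValueError("correlation function is constant")
    return [c / sigma for c in corr]


# Small worked example: the standard deviation of these values is 2.0.
cL = confidence_levels([2.0, -2.0, 2.0, -2.0])   # [1.0, -1.0, 1.0, -1.0]
```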

As can be seen, the typical correlation is relatively flat with respect to cL, and centered about cL=0. However, the function contains two peaks, which are separated by pL_(old) (see equation 2) and extend upwards to cL values that are above the detection threshold when a watermark is present. When the correlation peaks are negative, the above statement applies to their absolute values.

A horizontal line represents the detection threshold. The detection threshold value controls the false alarm rate.

Two kinds of false alarm rate exist: the false positive rate, defined as the probability of detecting a watermark in non-watermarked items, and the false negative rate, defined as the probability of not detecting a watermark in watermarked items. Generally, the requirement on the false positive rate is more stringent than that on the false negative rate.

After each detection interval, the detector determines whether or not the original watermark is present, and on this basis outputs a “yes” or a “no” decision to the outputting device and, at the same time, to the control signal generating unit 500.

If desired, to improve this decision making process, a number of detection windows may be considered. In such an instance, the false positive probability is a combination of the individual probabilities for each detection window considered, dependent upon the desired criteria. For instance, it could be determined that if the correlation function has two peaks above a threshold of cL=7 on any two out of three detection intervals, then the watermark is deemed to be present. Such detection criteria can be altered depending upon the desired use of the watermark signal and to take into account factors such as the original quality of the host signal and how badly the signal is likely to be corrupted during normal transmission.
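By way of illustration only, a "two out of three" decision rule of the kind described above, and the resulting combined false positive probability, can be sketched as follows. The function names are hypothetical. The combined probability assumes that the detection windows are statistically independent and share the same per-window false positive probability p, which the passage above does not state.

```python
from math import comb
from typing import Sequence


def window_has_two_peaks(cL: Sequence[float], threshold: float = 7.0) -> bool:
    """True if at least two correlation values exceed the threshold in magnitude."""
    return sum(1 for v in cL if abs(v) > threshold) >= 2


def watermark_present(window_hits: Sequence[bool], k: int = 2) -> bool:
    """Deem the watermark present if at least k windows report a hit."""
    return sum(bool(h) for h in window_hits) >= k


def combined_false_positive(p: float, n: int = 3, k: int = 2) -> float:
    """Probability of at least k false hits in n independent windows."""
    return sum(comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k, n + 1))


decision = watermark_present([True, False, True])   # True: 2 of 3 windows hit
p_combined = combined_false_positive(0.1)           # 3*0.01*0.9 + 0.001 = 0.028
```

Under these assumptions, requiring agreement between several windows lowers the false positive probability (0.028 rather than 0.1 in the example) at the cost of a longer decision time.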

To summarize the general operation of this particular re-embedding process, the re-embedding apparatus 600 is arranged to receive a signal y′_(old) containing a watermark at input 602. The signal y′_(old) in this instance has been generated by the watermark embedding apparatus shown in FIGS. 2 and 3, and includes a watermark comprising two circularly shifted versions w_(old) and w_(ref) of a single sequence w_(s) of values. A copy of the received signal y′_(old) is passed to the detector 640.

As described above, the detector 640 is arranged to detect the presence of the watermark within the signal y′_(old), and to estimate the watermark embedding parameters (e.g. the amounts d by which the watermark sequences were circularly shifted, and the gain factor α used to control the trade off between the audibility and the robustness of the watermark).

FIG. 10 illustrates a correlation function of the watermark embedded within y′_(old), with the original sequence of values (w_(s)) used to form w_(old). As can be seen, the payload pL_(old) of the watermark in y′_(old) is, at least in part, defined by the two amounts by which the sequences comprising the watermark have been circularly shifted, d_(old) and d_(ref).

In this preferred embodiment, only a portion of the original watermark signal is removed (the sequence of values which have been circularly shifted by the amount d_(old)). The same sequence of values (w_(s)) as utilised in the original watermark is then circularly shifted by a new amount (d_(new)), and embedded within the information signal, using the detected embedding parameters.

FIG. 11 illustrates a correlation function of the same watermark signal shown in FIG. 10, but in which the sequence of values corresponding to delay d_(old) has been removed (i.e. as if “−w_(old)” had been added to y′_(old)). The result is a single correlation peak at d_(ref).

FIG. 11 shows the correlation function of the same signal shown in FIG. 10, but after the same sequence of values w_(s), with a new circular shift delay (d_(new)), has been inserted. This results in two correlation peaks, separated by a payload pL_(new) (see equations 2 and 4), i.e. a new watermark signal. It will be appreciated that by only removing one half of the original watermark signal, and subsequently adding in a replacement half of the watermark signal, a new watermark has been generated but with minimum impact upon the quality of the information signal into which the watermark is embedded.
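By way of illustration only, the partial removal and re-embedding described above can be sketched as follows. The sketch works directly on one sequence length of symbol values and ignores window shaping, the symbol period, filtering and synchronization. The function names, the random ±1 sequence, the host model, the shift amounts and the gain α=1 are all hypothetical.

```python
import random
from typing import List, Sequence


def circular_shift(seq: Sequence[float], d: int) -> List[float]:
    """Delay seq circularly by d positions: out[i] = seq[(i - d) % n]."""
    n = len(seq)
    return [seq[(i - d) % n] for i in range(n)]


def circular_correlation(y: Sequence[float], ref: Sequence[float]) -> List[float]:
    """Correlate y with every circular shift of the reference sequence."""
    n = len(ref)
    return [sum(y[i] * ref[(i - d) % n] for i in range(n)) for d in range(n)]


def peak_delays(corr: Sequence[float], count: int = 2) -> List[int]:
    """Delays of the `count` largest-magnitude correlation values."""
    order = sorted(range(len(corr)), key=lambda d: abs(corr[d]), reverse=True)
    return sorted(order[:count])


def re_embed(y_old, w_s, d_old: int, d_new: int, alpha: float) -> List[float]:
    """Add -alpha*w_old (removal) and +alpha*w_new (re-embedding) to y_old."""
    w_old = circular_shift(w_s, d_old)
    w_new = circular_shift(w_s, d_new)
    return [y - alpha * o + alpha * n for y, o, n in zip(y_old, w_old, w_new)]


rng = random.Random(7)
L_w, alpha = 512, 1.0
d_ref, d_old, d_new = 0, 100, 300
w_s = [rng.choice((-1.0, 1.0)) for _ in range(L_w)]
host = [rng.gauss(0.0, 0.5) for _ in range(L_w)]

# Original watermarked signal: host plus two shifted copies of w_s.
y_old = [h + alpha * (r + o) for h, r, o in
         zip(host, circular_shift(w_s, d_ref), circular_shift(w_s, d_old))]
y_new = re_embed(y_old, w_s, d_old, d_new, alpha)

before = peak_delays(circular_correlation(y_old, w_s))   # peaks at d_ref, d_old
after = peak_delays(circular_correlation(y_new, w_s))    # peaks at d_ref, d_new
```

The sequence at d_(ref) is never touched, so only one of the two sequences is subtracted and only one is added, consistent with the aim of limiting the impact on the host signal.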

The detector 640 also provides synchronization information such as the time offset (Δt) to a delay unit 629. The copy of the input signal y′_(old) passed to the delay unit 629 is then appropriately synchronized to take into account any time offsets, rescaling and also the intrinsic delay caused by the various operations carried out by the units in the re-embedder (e.g. 630 b, 624, 625, 615), and to ensure that y_(old) is synchronized with y_(b) at the adder 626. One copy of the signal y′_(old) is passed to the filter H 615. This is similar to the filter shown in FIG. 7, and can be omitted. Such a filter can, for instance, be a band-pass filter, and gives an output y_(b).

It will be appreciated that the above embodiments are provided by way of example only. Various modifications will be apparent to the skilled person.

For instance, whilst the preferred embodiment has described the partial removal of the original watermark, it will be appreciated that the whole of the original watermark could be removed and replaced. Equally, whilst only one sequence has been described as being removed and replaced by one sequence, it will be appreciated that a single sequence could be replaced by two or more sequences. Alternatively, if the original payload comprised three circularly shifted sequences or more, two such sequences could be replaced by a single sequence, or a plurality of sequences.

Whilst the new watermark has been described as utilizing the same values of embedding parameters as originally used, the new watermark could of course use alternative values for any one or more of the embedding parameters.

For instance, whilst the above embodiment has described the watermark altering signal w_(c) as being scaled by a factor α, it will be appreciated that the two components forming w_(c) (i.e. w_(old) and w_(new)) could be scaled by different amounts before being added together, or before being separately added directly to the received signal y′_(old). In order for the portion of the watermark to be removed, it is desirable that the embedding strength of the negative version of w_(old) is similar to that originally used to embed w_(old) in the host signal. However, w_(new) can be embedded into y′_(old) with any desired strength.

It is desirable that watermarks are embedded into the host signal without unduly affecting the quality of the host signal. Preferably, all embedded watermark signals are imperceptible to an observer, i.e. in an audio signal, the effect of the watermark signal cannot be heard, or in a video signal, the effect of the watermark signal cannot be seen.

The ITU standard “Method for objective Measurements of Perceived audio quality”, International Telecommunication Union, Geneva, Switzerland (1999), defines a five-grade scoring system (which is in conformance with the ITU-R Rec. BS.1116 (rev. 1) (1997) and ITU-R Rec. BS.562-3 (1990) standards). The various scores are: 5=Imperceptible, 4=Perceptible but not annoying, 3=Slightly annoying, 2=Annoying, 1=Very annoying.

Whilst it is preferable that all watermark signals are imperceptible, equally, a scoring of 4 on the ITU scale (“perceptible but not annoying”) is acceptable in most systems.

Whilst the above embodiment describes the implementation of the present invention with respect to one particular watermarking scheme, it will be appreciated that the invention can in fact be implemented using many other types of watermarking schemes.

For instance, one type of audio watermarking scheme is to use temporal correlation techniques to embed the desired data (e.g. copyright information) into the audio signal.

This technique is effectively an echo-hiding algorithm, in which the strength of the echo is determined by solving a quadratic equation. The quadratic equation is generated by auto-correlation values at two positions: one at delay equal to τ, and one at delay equal to 0. In such a scheme, as echoes of the audio signal are added to the original audio signal, the resulting signal is in fact both an amplitude and a phase modulated version of the original audio signal. At the detector, the watermark is extracted by determining the ratio of the auto-correlation function at the two delay positions.
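By way of illustration only, the detection side of such an echo-hiding scheme can be sketched as follows: an echo of strength a at delay τ is added to a host signal, and the detector evaluates the ratio of the auto-correlation at delay τ to that at delay 0. The function names, the white Gaussian host, the delay and the echo strength are assumptions, and the quadratic equation used to choose the echo strength is not reproduced. For a white host the ratio is expected to be close to a/(1+a²).

```python
import random
from typing import List, Sequence


def add_echo(x: Sequence[float], tau: int, a: float) -> List[float]:
    """Return y[n] = x[n] + a * x[n - tau] (no echo for the first tau samples)."""
    return [x[n] + (a * x[n - tau] if n >= tau else 0.0)
            for n in range(len(x))]


def autocorrelation(y: Sequence[float], lag: int) -> float:
    """Unnormalized auto-correlation of y at the given non-negative lag."""
    return sum(y[n] * y[n - lag] for n in range(lag, len(y)))


def echo_ratio(y: Sequence[float], tau: int) -> float:
    """Ratio of the auto-correlation at delay tau to that at delay 0."""
    return autocorrelation(y, tau) / autocorrelation(y, 0)


rng = random.Random(3)
host = [rng.gauss(0.0, 1.0) for _ in range(20000)]   # assumed white host
marked = add_echo(host, tau=50, a=0.5)

ratio_marked = echo_ratio(marked, 50)   # near 0.5 / (1 + 0.25) = 0.4
ratio_host = echo_ratio(host, 50)       # near 0 for an unmarked white host
```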

Also known are watermarking schemes based on the amplitude modulation of DFT (Discrete Fourier Transform) coefficients, which require the calculation of DFTs at both the encoder and the decoder.

Similarly, WO 98/53565, U.S. Pat. No. 6,175,627 and WO 00/00969 describe alternative techniques, to which the present invention could be applied, for embedding or encoding auxiliary signals (such as copyright information) into a multimedia host or cover signal. As detailed in WO 00/00969, a replica of the cover signal, or a portion of the cover signal in a particular domain (time, frequency or space), is generated according to a stego key, which specifies modification values to the parameters of the cover signal. The replica signal is then modified by an auxiliary signal corresponding to the information to be embedded, and inserted back into the cover signal so as to form the stego signal.

At the decoder, in order to extract the original auxiliary data, a replica of the stego signal is generated in the same manner as the replica of the original cover signal, which requires the use of the same stego key. The resulting replica is then correlated with the received stego signal, so as to extract the auxiliary signal.

Using an alternative embodiment of the present invention, the extracted auxiliary signal can be replaced by a new one. This can be achieved by appropriately subtracting the auxiliary information from the received signal using the stego key and the embedding parameters estimated using the detection unit. In relation to FIG. 1, this can be put into effect by utilizing a detector 640, a watermark generator 650, and a watermark re-embedder 620 responsive to the underlying embedding algorithm.

It will be appreciated by the skilled person that various implementations not specifically described would be understood as falling within the scope of the present invention. For instance, whilst only the functionality of the embedding and detecting apparatus has been described, it will be appreciated that the apparatus could be realized as a digital circuit, an analog circuit, a computer program, or a combination thereof.

Within the specification it will be appreciated that the word “comprising” does not exclude other elements or steps, that “a” or “an” does not exclude a plurality, and that a single processor or other unit may fulfil the functions of several means recited in the claims.

1. A method of processing a multimedia signal comprising a watermark signal, the method comprising the steps of: removing at least a portion of an original watermark signal; and adding a new watermark signal to the multimedia signal so as to form a new watermarked multimedia signal.

2. A method as claimed in claim 1, wherein said original watermark signal is removed by applying a negative version of said original watermark signal to the multimedia signal.

3. A method as claimed in claim 1, further comprising the step of determining the value of at least one of the parameters used to embed the original watermark in the multimedia signal.

4. A method as claimed in claim 3, wherein said value of said parameter is utilized to remove at least a portion of said original watermark signal.

5. A method as claimed in claim 3, wherein said new signal is embedded in the multimedia signal using said embedding parameters having said determined value.

6. A method as claimed in claim 3, wherein said new signal is embedded in the multimedia signal using said embedding parameters having values other than the said determined values.

7. A method as claimed in claim 3, wherein said parameter comprises at least one of: embedding strength, synchronization information, time offset, time-scaling, an amount of a circular shift of a sequence, and a watermark symbol period.

8. A method as claimed in claim 1, wherein all of said original watermark is removed, and the new signal comprises a new watermark signal.

9. A method as claimed in claim 1, wherein said watermark comprises at least two sequences of values, at least one sequence of values being removed as said portion of the watermark signal, so as to leave at least one remaining sequence of values from the original watermark signal; and wherein said new signal comprises at least one further sequence of values which together with said remaining sequence forms a new watermark signal.

10. A method as claimed in claim 9, wherein all of said sequences of values are formed from a single sequence of values which has been circularly shifted by different amounts.

11. A method as claimed in claim 1, wherein said removed portion of the watermark signal was embedded with a predetermined strength into said multimedia signal, the method comprising the step of embedding the new signal into the multimedia signal with preferably the same predetermined strength.

12. A method as claimed in claim 11, wherein the embedding strength is such that the degradation in the quality of the new watermarked multimedia signal is perceptible but not annoying.

13. A method as claimed in claim 1, wherein at least one of the original watermark signal and the new watermark signal comprises a smoothly varying signal formed by applying a window shaping function to a sequence of values, the integral over the window shaping function being zero.

14. A method as claimed in claim 13, wherein the window shaping function has a bi-phase behavior.

15. A method as claimed in claim 14, wherein the bi-phase window comprises at least two Hanning windows of opposite polarities.

16. A computer program arranged to perform the method of claim 1.

17. A record carrier comprising a computer program as claimed in claim 16.

18. A method of making available for downloading a computer program as claimed in claim 16.

19. An apparatus for processing a multimedia signal comprising a watermark signal, the apparatus comprising: a deletion unit arranged to remove at least a portion of the watermark signal; and an embedder arranged to add a new signal to the multimedia signal so as to form a new watermark signal.

20. An apparatus as claimed in claim 19, further comprising a detector arranged to detect at least one value of a parameter of said watermark signal.

21. A receiver of a multimedia signal comprising an apparatus as claimed in claim 19.